12. Function Approximation
Given a problem domain with continuous states s \in \mathcal{S} = \mathbb{R}^{n}, we wish to find a way to represent the value function v_{\pi}(s) (for prediction) or q_{\pi}(s, a) (for control).
We can do this by choosing a parameterized function that approximates the true value function:
\hat{v}(s, \mathbf{w}) \approx v_{\pi}(s)
\hat{q}(s, a, \mathbf{w}) \approx q_{\pi}(s, a)
Our goal then reduces to finding a set of parameters \mathbf{w} that yields an accurate approximation of the value function. We can use the general reinforcement learning framework, with a Monte-Carlo or Temporal-Difference approach, and modify the update mechanism according to the chosen function approximator.
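For example, if \hat{v} is differentiable in \mathbf{w}, the parameters can be adjusted by stochastic gradient descent after each sample, using the Monte-Carlo return G_t or the TD(0) target as the estimate of the true value (the standard semi-gradient formulation):
\mathbf{w} \leftarrow \mathbf{w} + \alpha \left[ G_t - \hat{v}(S_t, \mathbf{w}) \right] \nabla_{\mathbf{w}} \hat{v}(S_t, \mathbf{w})
\mathbf{w} \leftarrow \mathbf{w} + \alpha \left[ R_{t+1} + \gamma \hat{v}(S_{t+1}, \mathbf{w}) - \hat{v}(S_t, \mathbf{w}) \right] \nabla_{\mathbf{w}} \hat{v}(S_t, \mathbf{w})
where \alpha is a step size and \gamma the discount factor.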
Feature Vectors
A common intermediate step is to compute a feature vector that is representative of the state:
\mathbf{x}(s)
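As a minimal sketch of how this fits together, the snippet below uses a simple polynomial feature map on a scalar state and a linear approximator \hat{v}(s, \mathbf{w}) = \mathbf{w}^{\top} \mathbf{x}(s), updated with semi-gradient TD(0). The names (polynomial_features, v_hat, td0_update) and the choice of polynomial features are illustrative assumptions, not prescribed by the method; tile coding or radial basis functions are common alternatives.

```python
import numpy as np

def polynomial_features(s, degree=3):
    # Map a scalar continuous state s to a feature vector x(s).
    # A polynomial basis is assumed here purely for illustration.
    return np.array([s ** i for i in range(degree + 1)])

def v_hat(s, w, features=polynomial_features):
    # Linear value estimate: v_hat(s, w) = w . x(s)
    return w @ features(s)

def td0_update(w, s, r, s_next, done, alpha=0.01, gamma=0.99,
               features=polynomial_features):
    # One semi-gradient TD(0) step on the weight vector w.
    # For a linear approximator, grad_w v_hat(s, w) = x(s),
    # so the update is w <- w + alpha * delta * x(s).
    x = features(s)
    target = r if done else r + gamma * (w @ features(s_next))
    delta = target - w @ x          # TD error
    return w + alpha * delta * x

# Hypothetical usage with transitions (s, r, s_next, done)
# generated by following some policy pi:
# w = np.zeros(4)
# for s, r, s_next, done in episode:
#     w = td0_update(w, s, r, s_next, done)
```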